Learning Attentive Representations for Environmental Sound Classification
Authors
Abstract
Similar resources
Learning Representations for Relation Classification
Knowledge bases can be applied to a wide variety of tasks such as search and question answering, however they are plagued by the problem of incompleteness. In this project, we propose two models for automated relation classification using extracted entity pairs and related sentences from natural text. We evaluate both models on a portion of the Stanford KBP dataset across 38 relations, achievin...
Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks
Recent successful applications of convolutional neural networks (CNNs) to audio classification and speech recognition have motivated the search for better input representations for more efficient training. Visual displays of an audio signal, through various time-frequency representations such as spectrograms offer a rich representation of the temporal and spectral structure of the original sign...
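The abstract above refers to time-frequency representations such as spectrograms as input to CNNs. As a minimal illustration of what such a representation is, the sketch below computes a plain magnitude spectrogram with NumPy (the frame length and hop size are illustrative choices, not values from the paper):

```python
import numpy as np

def spectrogram(signal, frame_len=512, hop=256):
    """Magnitude spectrogram: slice the signal into overlapping
    frames, apply a Hann window, and take the FFT magnitude."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft yields frame_len // 2 + 1 frequency bins per frame
    return np.abs(np.fft.rfft(frames, axis=1))

# Example: 1 s of a 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440 * t)
spec = spectrogram(sig)
print(spec.shape)  # (61, 257): time frames x frequency bins
```

The resulting 2-D array (time x frequency) is what CNN-based audio classifiers typically consume as an image-like input; log-mel variants apply a mel filterbank and a logarithm on top of this.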
Attentive Classification
In this paper, we present a two-step approach for object recognition based on principles of human perception: Attentive Classification. First, regions of interest are detected by a biologically motivated attention system. Second, these regions are analyzed by a fast classifier based on the Adaboost learning technique. Thus, the classification effort is restricted to a small data subset. The app...
Attentive Tracking of Sound Sources
Auditory scenes often contain concurrent sound sources, but listeners are typically interested in just one of these and must somehow select it for further processing. One challenge is that real-world sounds such as speech vary over time and as a consequence often cannot be separated or selected based on particular values of their features (e.g., high pitch). Here we show that human listeners ca...
SoundNet: Learning Sound Representations from Unlabeled Video
We learn rich natural sound representations by capitalizing on large amounts of unlabeled sound data collected in the wild. We leverage the natural synchronization between vision and sound to learn an acoustic representation using two-million unlabeled videos. Unlabeled video has the advantage that it can be economically acquired at massive scales, yet contains useful signals about natural soun...
Journal
Journal title: IEEE Access
Year: 2019
ISSN: 2169-3536
DOI: 10.1109/access.2019.2939495